17 research outputs found

    Linear-time Online Action Detection From 3D Skeletal Data Using Bags of Gesturelets

    Sliding window is one direct way to extend a successful recognition system to handle the more challenging detection problem. While action recognition decides only whether or not an action is present in a pre-segmented video sequence, action detection identifies the time interval where the action occurred in an unsegmented video stream. Sliding-window approaches to action detection can, however, be slow, as they maximize a classifier score over all possible sub-intervals. Even though newer schemes use dynamic programming to speed up the search for the optimal sub-interval, they require offline processing of the whole video sequence. In this paper, we propose a novel approach for online action detection based on 3D skeleton sequences extracted from depth data. It identifies the sub-interval with the maximum classifier score in linear time. Furthermore, it is invariant to temporal scale variations and is suitable for real-time applications with low latency.
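    The linear-time maximum-score sub-interval search can be illustrated with a Kadane-style maximum-subarray scan. This is a standard technique shown here as a sketch, not necessarily the paper's exact formulation; the function name and the per-frame scores are hypothetical.

```python
def best_interval(scores):
    """Kadane-style scan: return (start, end, total) of the sub-interval
    with the maximum summed per-frame classifier score, in one linear pass.
    The per-frame scores are hypothetical detector outputs."""
    best_sum, best = float("-inf"), (0, 0)
    cur_sum, cur_start = 0.0, 0
    for i, s in enumerate(scores):
        if cur_sum <= 0:                 # restarting beats extending
            cur_sum, cur_start = s, i
        else:
            cur_sum += s
        if cur_sum > best_sum:
            best_sum, best = cur_sum, (cur_start, i)
    return best[0], best[1], best_sum
```

    Each new frame updates the state in O(1), so an online detector built this way can report the current best interval with no lookahead, which is what makes low-latency operation possible.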

    Toward Flare-Free Images: A Survey

    Lens flare, caused by a strong light source shining toward the camera, is a common image artifact that can significantly degrade image quality and affect the performance of computer vision systems. This survey provides a comprehensive overview of the multifaceted domain of lens flare, encompassing its underlying physics, influencing factors, types, and characteristics. It delves into the complex optics of flare formation, arising from factors like internal reflection, scattering, diffraction, and dispersion within the camera lens system. The diverse categories of flare are explored, including scattering, reflective, glare, orb, and starburst types. Key properties such as shape, color, and localization are analyzed. The numerous factors impacting flare appearance are discussed, spanning light source attributes, lens features, camera settings, and scene content. The survey extensively covers the wide range of methods proposed for flare removal, including hardware optimization strategies, classical image processing techniques, and learning-based methods using deep learning. It describes pioneering flare datasets created for training and evaluation, as well as how they were built. Commonly employed performance metrics such as PSNR, SSIM, and LPIPS are explored. Challenges posed by flare's complex and data-dependent characteristics are highlighted. The survey provides insights into best practices, limitations, and promising future directions for flare removal research. Reviewing the state-of-the-art enables an in-depth understanding of the inherent complexities of the flare phenomenon and the capabilities of existing solutions. This can inform and inspire new innovations for handling lens flare artifacts and improving visual quality across various applications.
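    Of the metrics the survey lists, PSNR is the simplest to state. A minimal sketch follows; the function name and the 8-bit peak value are assumptions, not taken from the survey.

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between a ground-truth image and a
    flare-removed result; higher means closer to the reference."""
    diff = reference.astype(np.float64) - test.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0.0:
        return float("inf")              # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

    SSIM and LPIPS are structural and learned perceptual metrics, respectively, and need more machinery than fits in a short sketch.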

    Learning the manifolds of local features and their spatial arrangements

    Local features play an important role in many computer vision problems; they are highly discriminative and possess invariance properties. However, the spatial configuration of local features plays an essential role in recognition. Spatial neighborhoods capture local geometry and collectively provide shape information about a given object. In this dissertation we studied explicit and implicit ways to exploit the joint feature-spatial arrangement in images for recognition problems. We introduce a framework to learn an embedded representation of images that captures the similarity between features and the spatial arrangement information. The framework was successfully applied in object recognition and localization contexts. The framework was also applied to feature matching across multiple images. We also showed the viability of the framework in regression from local features for viewpoint estimation. We also studied implicit ways to exploit the feature-spatial manifold structure in the data without explicit embedding and within a transductive learning paradigm for object localization. We learned the labels of the local features from an object class in a manner that provides spatial and feature smoothing over the labels. To achieve that, we adapted the Global and Local Consistency solution for label propagation to our implicit manifold model to infer the labels of local features. We showed excellent accuracy rates with very low false positive rates on the learned feature labels in the test images. Ph.D. dissertation. Includes bibliographical references. By Marwan Tork

    One-Shot Multi-Set Non-rigid Feature-Spatial Matching

    We introduce a novel framework for non-rigid feature matching among multiple sets in a way that takes into consideration both the feature descriptor and the features' spatial arrangement. We learn an embedded representation that combines both the descriptor similarity and the spatial arrangement in a unified Euclidean embedding space. This unified embedding is reached by minimizing an objective function that has two sources of weights: the feature spatial arrangement and the feature descriptor similarity scores across the different sets. The solution can be obtained directly by solving one eigenvalue problem that is linear in the number of features. Therefore, the framework is very efficient and can scale up to handle a large number of features. Experimental evaluation is done using different sets, showing outstanding results compared to the state of the art; up to 100% accuracy is achieved in the case of the well-known 'Hotel' sequence.
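    The "one eigenvalue problem" step can be sketched with a standard Laplacian eigenmap on an affinity matrix. This is an illustrative stand-in, not the paper's exact objective: the affinity matrix here is a toy placeholder for the combined descriptor/spatial weights, and the function name is an assumption.

```python
import numpy as np

def spectral_embedding(W, dim=2):
    """Embed points from a symmetric affinity matrix W by solving one
    eigenvalue problem on the graph Laplacian. W is a toy stand-in for
    the combined descriptor-similarity and spatial-arrangement weights."""
    L = np.diag(W.sum(axis=1)) - W       # unnormalized graph Laplacian
    vals, vecs = np.linalg.eigh(L)       # eigenvalues in ascending order
    return vecs[:, 1:dim + 1]            # drop the constant eigenvector
```

    Points with strong mutual affinity land near each other in the embedding space, which is what allows matching to be done there by simple nearest-neighbor search.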

    Regression from Local Features for Viewpoint and Pose Estimation

    In this paper we propose a framework for learning a regression function from a set of local features in an image. The regression is learned from an embedded representation that reflects the local features and their spatial arrangement, and enforces supervised manifold constraints on the data. We applied the approach to viewpoint estimation on a multi-view car dataset, a head pose dataset, and an arm posture dataset. The experimental results show that this approach outperforms the state-of-the-art approaches (by up to 67%) on very challenging datasets.

    Putting local features on a manifold

    Local features have proven very useful for recognition. Manifold learning has proven to be a very powerful tool in data analysis. However, applications of manifold learning to images have mainly been based on holistic, vectorized representations of images. The challenging question that we address in this paper is how we can learn image manifolds from a collection of local features in a smooth way that captures the feature similarity and spatial arrangement variability between images. We introduce a novel framework for learning a manifold representation from collections of local features in images. We first show how we can learn a feature embedding representation that preserves both the local appearance similarity and the spatial structure of the features. We also show how we can embed features from a new image by introducing a solution to the out-of-sample problem that is suitable for this context. By solving these two problems and defining a proper distance measure in the feature embedding space, we can reach an image manifold embedding space.
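    A common form of out-of-sample extension is Nystrom-style interpolation, shown here as an illustrative stand-in rather than the paper's actual solution; the function name and inputs are assumptions.

```python
import numpy as np

def out_of_sample(new_affinities, embedding):
    """Nystrom-style out-of-sample extension: place a new feature at the
    affinity-weighted average of the training embeddings, so it lands
    near the training features it most resembles."""
    w = new_affinities / new_affinities.sum()
    return w @ embedding
```

    The design point is that no eigenvalue problem has to be re-solved for each new image; embedding a new feature costs one weighted average.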

    QU-IR at SemEval 2016 Task 3: Learning to rank on Arabic community question answering forums with word embedding

    Resorting to community question answering (CQA) websites for finding answers has gained momentum in the past decade with the explosive rate at which social media has been proliferating. With many questions left unanswered on those websites, automatic and smart question answering systems have emerged. One of the main objectives of such systems is to harness the plethora of existing answered questions, hence transforming the problem into finding good answers to newly posed questions from similar previously-answered ones. As SemEval 2016 Task 3 "Community Question Answering" has focused on this problem, we have participated in the Arabic Subtask. Our system has adopted a supervised learning approach in which a learning-to-rank model is trained over data (questions and answers) extracted from Arabic CQA forums, using word2vec features generated from that data. Our primary submission achieved a 29.7% improvement over the MAP score of the baseline. Post-submission experiments were further conducted to integrate variations of the word2vec features into our system. Integrating covariance word embedding features raised the improvement over the baseline to 37.9%. 2016 Association for Computational Linguistics. This work was made possible by NPRP grant# NPRP 6-1377-1-257 from the Qatar National Research Fund (a member of Qatar Foundation).
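    The core ranking idea can be sketched as scoring candidate answers against a question in embedding space. This is a simplified, unsupervised stand-in for the trained learning-to-rank model described above; the vectors would come from a word2vec model averaged over tokens, and all names here are hypothetical.

```python
import numpy as np

def rank_answers(question_vec, answer_vecs):
    """Rank candidate answers by cosine similarity between (hypothetical)
    averaged word-embedding vectors of the question and each answer.
    Returns answer indices from most to least similar."""
    def cos(a, b):
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    scores = [cos(question_vec, v) for v in answer_vecs]
    return sorted(range(len(answer_vecs)), key=lambda i: -scores[i])
```

    In the actual system such similarity scores would be one feature among several fed to the supervised learning-to-rank model, rather than the ranking itself.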

    Learning a Joint Manifold Representation from Multiple Data Sets

    The problem we address in this paper is how to learn a joint representation from data lying on multiple manifolds. We are given multiple data sets, and there is an underlying common manifold among the different data sets. We propose a framework to learn an embedding of all the points on all the manifolds in a way that preserves the local structure on each manifold and, at the same time, collapses all the different manifolds into one manifold in the embedding space, while preserving the implicit correspondences between the points across different data sets. The proposed solution works as an extension of current state-of-the-art spectral-embedding approaches to handle multiple manifolds.